\section{Lower Bound}
\label{sec:lowerbound}

As demonstrated in the previous section, we need to ensure an error of at most $O(1/\sqrt{n})$ in the computation of $D$ to be able to conduct the KS-test with reasonable accuracy. In this section we will show that at least $\Omega(\sqrt{n})$ information must be retained by a sketch for estimating $D$ to attain this accuracy. This shows that our $O(\sqrt{n}\log{n})$ upper bound is nearly tight.

\begin{theorem}
\label{thm:lowerbound}
Any sketch for computing the KS-statistic to within $O(1/\sqrt{n})$ precision must use $\Omega(\sqrt{n})$ space.  
\end{theorem}

\begin{proof} 
Consider a data set in which the $n$ points take on $s \leq n$ distinct values. Any sketch that does not retain at least one sample from each of these $s$ distinct values will have an error of $1/s$ at some point along the c.d.f.\ of the data. This is easy to derive and is left as an exercise for the reader. Since we need to be able to compare against arbitrary distributions, this results in an error of  $1/s$ in the estimation of the KS-statistic. Substituting $s = \Theta(\sqrt{n})$ gives the desired result. 
\end{proof}
